Abstract


Genetic recombination has frequently been observed in coronaviruses. Here, we sequenced
multiple complete genomes of dromedary camel coronavirus HKU23 (DcCoV-HKU23) from
Nigeria, Morocco and Ethiopia and identified several genomic positions indicative of
cross-species virus recombination events with other Betacoronaviruses of the subgenus
Embecovirus (clade A B-CoVs). Recombinant fragments of a rabbit coronavirus
(RbCoV-HKU14) were identified at the hemagglutinin esterase gene position. Homologous
fragments of a rodent CoV were also observed at the 8.9 kDa open reading frame 4a at the
3’ end of the spike gene. The patterns of recombination varied geographically across the
African region, highlighting a mosaic structure of DcCoV-HKU23 genomes circulating in
dromedaries. Our results highlight active recombination of coronaviruses circulating in
dromedaries and are also relevant to the emergence and evolution of other
Betacoronaviruses, including MERS-coronavirus (MERS-CoV).


Importance 


Genetic recombination is often demonstrated in coronaviruses and can result in host range 
expansion or alteration in tissue tropism. Here, we showed interspecies recombination events 
of an endemic dromedary camel coronavirus HKU23 with other clade A Betacoronaviruses. 
Our results supported the possibility that the zoonotic pathogen, MERS-CoV, which also
co-circulates in the same camel species, may have undergone similar recombination events
facilitating its emergence or may do so in its future evolution.


Introduction 


Emerging infectious disease outbreaks usually arise by inter-species jumps of viruses 
between animal species, sometimes including humans. Coronaviruses have repeatedly made 
species jumps between animal species (e.g. SADS coronavirus from bats to swine) (1) and 
from animals to humans (2). Two human coronaviruses (HCoVs) 229E and OC43 now 
endemic in the human population emerged from camels and bovines respectively, within the 
past few hundred years (2, 3). SARS coronavirus emerged from bats via intermediate
mammalian hosts in live game animal markets in Guangdong to spread to over 25 countries
across 5 continents, sickening almost 8000 people and leading to almost 800 deaths (4-6). The
ability of coronaviruses to make inter-species jumps is facilitated by complex virus-host
interactions. A high frequency of virus genetic recombination is one strategy by which the
virus adapts to a new host.


Virus genetic recombination is frequently observed in coronaviruses and other positive sense
RNA viruses. Murine hepatitis virus (MHV), a clade A B-CoV, is a well-studied example of
homologous recombination, with up to 25% of its progeny in infected cells being
demonstrated to be recombinants (7). The high frequency of recombination is believed to
result from the large genome size, the intrinsic template-switching property of the viral
RNA-dependent RNA polymerase (RdRp) during replication and the abundance of
subgenomic RNA strands for template switching (8, 9). The role of RdRp in RNA
recombination has been shown in poliovirus, where a single amino acid mutation in
the RdRp can decrease the RNA recombination frequency (10). The
exoribonuclease (ExoN) activity in replicase nonstructural protein (nsp) 14 of CoVs, which
constitutes the proofreading activity of genome replication, has been suggested to be a
potential regulator of RNA recombination (11). The presence of group-specific genes in
CoVs is assumed to be a result of heterologous recombination, which involves exchange of
non-homologous viral or cellular RNAs. The hemagglutinin esterase (HE) gene, which is only
expressed in clade A B-CoVs, is believed to have been acquired from influenza C virus
through such heterologous recombination (12).


In 2012, a novel respiratory pathogen, Middle East respiratory syndrome coronavirus
(MERS-CoV), was isolated from a patient with severe respiratory illness in Jeddah, Saudi
Arabia (13). It is a zoonotic virus found in dromedary camels that occasionally transmits
to humans (14-16). MERS-CoV is enzootic in dromedary camels in the Middle East as well as
Africa, with the greatest virus diversity found in Africa (17). Circulation of multiple lineages
of clade B MERS-CoV in dromedary camels eventually resulted in a recombinant lineage 5
virus that caused major outbreaks during 2015 both within Saudi Arabia and in South Korea,
following introduction of the virus by a returning traveler (18). Recent studies have shown
that two other coronaviruses, an alphacoronavirus, dromedary camel coronavirus 229E
(DcCoV-229E), and a B-CoV, dromedary camel coronavirus HKU23 (DcCoV-HKU23),
co-circulate in dromedaries in Saudi Arabia (18, 19). The co-circulation of at least three
coronaviruses within camels provides an opportunity for the emergence of novel
infections via recombination. It is therefore important to look for evidence of
recombination between coronaviruses co-circulating in dromedary camels because this may
have contributed to the emergence of MERS-CoV and may contribute to the future emergence
of viruses of zoonotic and epidemic potential. Here we report the genetic diversity of
DcCoV-HKU23 in the African camel population and identify several recombination events
that have taken place with clade A B-CoVs, including bovine coronavirus (BCoV) and more
distant species such as rabbit coronavirus (RbCoV-HKU14) and rodent coronavirus
(RodentCoV). We carried out our studies in West (Nigeria), East (Ethiopia) and North
(Morocco) Africa because over 70% of the global population of dromedaries is found in
Africa, and this is likely where the greatest diversity of these dromedary coronaviruses is
manifest and where MERS-CoV emerged.


Methods & materials 
Sample collection. 


Nasal swabs and sera were collected from dromedary camels sampled in Nigeria, Morocco 
and Ethiopia in previous studies of MERS-CoV in 2015 and 2016 (17, 20). Camel nasal 
swabs were collected from a camel abattoir in Kano, Nigeria (n=2529) in 2015 and 2016 (17, 
21), from dromedary herds and abattoirs in Morocco (n=1569) in 2015 and 2016 and Ethiopia 
(n=621) in 2015 (20) (Table 1). The camels from Morocco and Ethiopia were mostly raised 
for meat, milk production or transport. Camel sera were concurrently collected from the 
abattoir in Nigeria (n=150) and from abattoirs and farms in both Ethiopia (n=100) and 
Morocco (n=100). Sampled camels were aged from 1 month to 20 years (median age
of 3 years).


DcCoV-HKU23 detection and genome sequencing. 


Total nucleic acid was extracted from swab samples using the EasyMAG system (bioMerieux,
France). RNA was reverse transcribed into cDNA with random hexamers using the PrimeScript™
RT reagent Kit (Perfect Real Time) (Takara, Japan), according to the manufacturer’s protocol.
cDNAs were screened for DcCoV-HKU23 using a broad-range pancoronavirus nested PCR 
assay designed to detect known and unknown CoVs targeting the consensus region of the 
RNA-dependent RNA-polymerase (RdRP) gene (22). RT-PCR positive amplicons were 
purified using the ExoSap-IT® reagent (USB, USA) and Sanger sequenced to identify the 
CoV identity. Samples with DcCoV-HKU23-like virus sequences were identified and
subjected to viral load quantification using a reverse transcription quantitative PCR
(RT-qPCR) assay. Oligonucleotides were designed to target the N gene of both
DcCoV-HKU23 and bovine CoV (Forward primer: 5’-GTCAATACCCCGGCTGAC-3’,
Probe: 5’-(FAM)TCGGGACCCAAGTAGCGATGAGGC(BHQ)-3’ and
Reverse primer: 5’-AACCCTGAGGGAGTACCG-3’). The RT-qPCR reaction was performed using
the TaqMan® Fast Virus 1-Step Master Mix (Thermo Fisher Scientific, USA), with the
following cycling protocol: 5 min at 50°C for reverse transcription, followed by 20 s at
95°C and 40 cycles of 3 s at 95°C and 30 s at 60°C. Samples with low cycle threshold (Ct)
values were selected for full genome sequencing. Reverse transcription with HKU23-specific
primers targeting different regions of the genome was used to generate cDNA, which was
subsequently amplified by PCR with primers designed to generate overlapping amplicons
covering the whole genome. The primer sequences are available upon request. PCR amplicons
from each sample were pooled for next generation sequencing and processed with the Nextera
XT library preparation kit following the protocol provided by the manufacturer. Sequencing
was performed using the Illumina MiSeq instrument with paired-end reads of approximately
300 bp. Raw sequence reads were mapped to a reference DcCoV-HKU23 genome
(KF906250.1) using BWA (23). The sequence of the target virus was generated by taking the
majority consensus of the mapped reads, with sequencing coverage at each position higher
than 100 times.
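
The majority-consensus step can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the input format (one string of observed bases per reference position) and the masking of low-coverage positions with "N" are assumptions made only for illustration.

```python
from collections import Counter

MIN_DEPTH = 100  # coverage threshold stated in the text


def majority_consensus(pileup, min_depth=MIN_DEPTH):
    """Return a consensus string from per-position base observations.

    pileup: list of strings, one per reference position, each holding
            the bases observed in reads mapped to that position
            (hypothetical input format).
    Positions whose depth is not higher than min_depth are reported
    as 'N' (an assumption of this sketch).
    """
    consensus = []
    for bases in pileup:
        if len(bases) <= min_depth:
            consensus.append("N")
            continue
        # Most frequent base at this position
        base, _count = Counter(bases).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Toy example: depth 170 (majority A), depth 50 (masked), depth 101 (T)
example = ["A" * 150 + "G" * 20, "C" * 50, "T" * 101]
print(majority_consensus(example))
```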


Genomic and Phylogenetic analysis. 


Open reading frames (ORFs) of the virus genome encoding proteins were predicted
using ORF finder (NIH, USA). Full genomes of DcCoV-HKU23, together with previously
reported sequences from Saudi Arabia, bovine CoV and human CoV OC43, were aligned using
MAFFT. Gaps and poorly aligned regions in the alignment were manually edited. Pairwise
genetic distances were calculated using MEGA 7 (24). Phylogenetic analysis of DcCoV-HKU23
was performed by the maximum likelihood method using IQ-Tree, version 1.6.8 (25).
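
As an illustration of what a pairwise distance in "base substitutions per site" measures, the following Python sketch computes the uncorrected proportion of differing sites (p-distance) between two aligned sequences. The substitution model used in MEGA 7 is not stated here, so the uncorrected p-distance and the skipping of gapped or ambiguous sites are assumptions of this sketch.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between two aligned sequences.

    Only positions where both sequences carry an unambiguous base
    (A, C, G or T) are compared; gaps and ambiguity codes are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# One difference over eight comparable sites
print(p_distance("ACGTACGT", "ACGTACGA"))
```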


Recombination analysis was performed using Simplot version 3.5.1 (26). Bootscan analysis
for a recombination event was performed on an alignment of the genome sequences as
described above, with a 50% consensus sequence of 4 DcCoV-HKU23 genomes from Nigeria with
the same genotype C2/Coutlier/C3 (NV1010, NV1092, NV1097 and NV1385) as the query
sequence. A sliding window of 600 nucleotides and a step of 100 nucleotides were used as
the scanning setting.
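
The scanning setting (600-nucleotide window, 100-nucleotide step) can be sketched in Python as follows. This is not the Bootscan method itself, which builds bootstrapped trees in every window; the sketch only enumerates the windows and reports a simple per-window identity between a query and one reference, which is a deliberately simplified stand-in. Function names are hypothetical.

```python
def sliding_windows(length, window=600, step=100):
    """Yield (start, end) half-open, 0-based windows along an alignment."""
    for start in range(0, length - window + 1, step):
        yield start, start + window


def window_identity(query, reference, window=600, step=100):
    """Per-window fraction of identical sites between two aligned sequences.

    Returns a list of (first_position, last_position, identity) tuples
    with 1-based inclusive positions.
    """
    results = []
    for start, end in sliding_windows(len(query), window, step):
        matches = sum(1 for a, b in zip(query[start:end], reference[start:end]) if a == b)
        results.append((start + 1, end, matches / window))
    return results


# Toy alignment: identical in the first 500 sites, different afterwards
query = "A" * 1000
reference = "A" * 500 + "C" * 500
for first, last, identity in window_identity(query, reference):
    print(first, last, round(identity, 3))
```

A drop in identity against one reference accompanied by a rise against another over the same windows is the kind of pattern that the tree-based scan then evaluates with bootstrap support.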


Microneutralization assays. 


Heat-inactivated (56°C for 30 minutes) camel sera were first diluted 1:10, then serially
two-fold diluted and mixed with equal volumes of virus at a dose of 200 50% tissue culture
infective doses (TCID50) of DcCoV-HKU23 isolate 368F (27). After 1 h of incubation at 37°C,
35 µL of the virus-serum mixture was added in quadruplicate to HRT-18G cell monolayers
in 96-well microtiter plates. After 1 h of adsorption, the virus-serum mixture was removed
and replaced with 150 µL of virus growth medium in each well. The plates were incubated
for 5 days at 37°C in 5% CO2 in a humidified incubator. Cytopathic effect was observed at
day 5 post-inoculation. The highest serum dilution protecting >50% of the replicate wells was
denoted the neutralizing antibody titer. A virus back titration of the input virus was included
in each batch of tests.
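
The titer read-out rule can be written as a short Python sketch. The input layout (a mapping from reciprocal serum dilution to the number of protected wells) and the function name are hypothetical; the rule itself, the highest dilution protecting more than 50% of the quadruplicate wells, follows the description above.

```python
def neutralizing_titer(protected_wells, replicates=4):
    """Return the highest reciprocal dilution protecting >50% of wells.

    protected_wells: dict mapping reciprocal serum dilution (10, 20, 40, ...)
                     to the number of replicate wells without cytopathic
                     effect (hypothetical input layout).
    Returns None when no dilution protects more than half of the wells.
    """
    protective = [
        dilution
        for dilution, n_protected in protected_wells.items()
        if n_protected / replicates > 0.5
    ]
    return max(protective) if protective else None


# Two-fold series starting at 1:10, read in quadruplicate
reading = {10: 4, 20: 4, 40: 3, 80: 2, 160: 0}
print(neutralizing_titer(reading))
```

With four replicates, "more than 50%" means at least three protected wells, so the 1:80 dilution in this toy reading (two of four wells) does not count.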


Results


Screening of DcCoV-HKU23 in African camels by RT-qPCR and microneutralization assay.


Nasal swab samples of dromedary camels in Nigeria (n=2529), Morocco (n=1569) and
Ethiopia (n=621) were tested for coronaviruses using the pan-CoV RT-PCR (22), and the
viruses were identified by sequencing PCR amplicons (Table 1). The overall prevalence of
HKU23 viruses at each location ranged from 0.4% of 1569 samples tested in Morocco to 2.2%
of 2529 from Nigeria (Table 1). A DcCoV-HKU23-specific quantitative real-time RT-PCR assay
was subsequently performed to identify samples with a high viral RNA copy number for whole
genome sequencing.


In Morocco, DcCoV-HKU23 RNA positivity in young camels aged <2 years (n=584) was
1.0%, not significantly different from that in adults (n=577), of which 0.17% were positive
(Fisher’s exact test, P=0.124). In Nigeria, DcCoV-HKU23 RNA positivity in young camels was
4.1% (n=194) compared to 2.0% in adults (n=2335) (Fisher’s exact test, P=0.0674). In
Ethiopia, young camels (n=136) had 1 positive swab while adults (n=314) had 5 positive
swabs, a difference that was also not significant (Fisher’s exact test, P=0.673). Swab
specimens were collected during the months October to April, with virus detection in most
months (Table 2).
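
The age comparison for Morocco can be retraced with a small, self-contained Python sketch of a two-sided Fisher's exact test. The positive counts used here (6 of 584 young camels and 1 of 577 adults) are not stated in the text; they are inferred from the reported percentages (1.0% and 0.17%) and should be read as an assumption. The two-sided P value is taken as the sum of the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x positives falling in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    tolerance = 1 + 1e-9  # guard against floating-point ties
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )


# Assumed counts: young 6 positive / 578 negative, adults 1 positive / 576 negative
p_value = fisher_exact_two_sided(6, 578, 1, 576)
print(round(p_value, 3))
```

Under these assumed counts the result is close to the reported P=0.124.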


To study the seroprevalence of DcCoV-HKU23 in African camels, dromedary sera were also
collected from a subset of camels during the same sampling occasions and were tested by
microneutralization assay. A high seroprevalence was detected in dromedary camels in all
three countries: 92% of 150 sera in Nigeria, 91% of 100 sera in Ethiopia and 79% of 100
sera in Morocco (Table 3). A lower seropositive rate
was observed in younger (<2 years) compared to older Moroccan camels from abattoirs (48%
vs. 92%; Fisher’s exact test, P=0.0036) and farms (76% vs. 100%; Fisher’s exact test,
P=0.0223). There was no marked difference in seroprevalence between young and old camels
in the Nigerian abattoir or in abattoirs or farms in Ethiopia.


Cross-neutralizing antibody responses to DcCoV-HKU23 and BCoV were evaluated by
testing camel sera with high, medium, low and no neutralizing DcCoV-HKU23 titers in
neutralization tests with the BCoV-Mebus strain. There was a significant correlation between
titers to DcCoV-HKU23 and BCoV, suggesting likely serological cross-reactivity between the
two viruses (Figure 1).


Evolutionary divergence and genetic diversity of DcCoV-HKU23.


Full genomes of DcCoV-HKU23 were obtained from four swab-samples in Nigeria (NV1010, 
NV1092, NV1097 and NV1385) and one sample each from Morocco (CAC2586) and 
Ethiopia (CAC1019). The African virus genomes were found to be closely related, with
pairwise base substitutions per site below 0.0270 (Table 4). These full genomes were
compared with those previously reported from Saudi Arabia (18). DcCoV-HKU23 in the African
region differed from those from Saudi Arabia by a range of 0.0223 to 0.0270 pairwise base
substitutions per site, comparable to the divergence observed among the regions in Africa.
Compared to other closely related species within clade A B-CoVs, they differed from
BCoV by a range of 0.0249 to 0.0300 pairwise base substitutions per site and from HCoV-OC43
by a range of 0.0445 to 0.0468 pairwise base substitutions per site. The species most
closely related to HCoV-OC43 remains BCoV, rather than DcCoV-HKU23.


The genetic diversity of DcCoV-HKU23 across Africa and the Middle East was studied
based on the ORF1ab gene by a distance plot using SSE version 1.3 (28) (Figure 2). As
recombination has previously been shown to increase progressively from the 5’ to the 3’ end
of the genome (29), ORF1ab was selected to study the genetic diversity with minimal
confounding by recombination. Along the ORF1ab gene, a mean pairwise
distance of about 0.01 was observed within the 6 DcCoV-HKU23 sequences from Africa and
the 4 reference sequences available from Saudi Arabia. The observed diversity of
DcCoV-HKU23 was comparable to BCoV, suggesting both viruses were introduced into their
animal hosts at similar points in time. Another circulating CoV in the same camel
populations, MERS-CoV, was included in the analysis and showed a diversity of about 0.004
by the same analysis (Figure 2), relatively lower compared to DcCoV-HKU23.


Phylogenetic analysis of DcCoV-HKU23 with BCoV and HCoV-OC43 sequences. 


To infer the phylogenetic relationship of the newly identified African DcCoV-HKU23 with
previously reported DcCoV-HKU23 viruses from Saudi Arabia and the closely related bovine
coronaviruses, phylogenetic trees based on the complete coding sequences of the
RdRp (2783 nt), spike (4101 nt) and nucleocapsid (1347 nt) genes were constructed. In
addition to the 6 full genomes, four more virus sequences with complete RdRp, S, and N
genes of DcCoV-HKU23 from Ethiopia (CAC1320, CAC1452) and Morocco (CAC2505,
CAC2753) were obtained and included in the analysis. Using the genotyping nomenclature
previously described for HCoV-OC43 and BCoV (30, 31) as a reference point, our sequences
of DcCoV-HKU23 in this study were mapped onto the 3 main sub-clusters of BCoV, namely
C1, C2 and C3, in which C1 contains BCoVs from the Americas, C2 contains BCoVs from
Europe and C3 contains the prototype BCoV (Figure 3). In the phylogenetic analysis of the
RdRp gene, all the African and Saudi DcCoV-HKU23 clustered within clade C2, which
includes BCoVs from Europe. African DcCoV-HKU23 sequences did not form a
monophyletic clade with the Saudi Arabian strains; instead, these sequences were scattered
within clade C2, suggesting multiple ancestral origins of DcCoV-HKU23 across different
geographic areas.


The analysis of the spike gene showed that the 8 African and Saudi DcCoV-HKU23 were
clustered together and grouped into a clade distinct from BCoV, which we designated as
clade Coutlier and which has a basal phylogenetic relationship to the BCoV clades. The
phylogeny of the viruses was geographically structured so that most of the sequences were
grouped within the Coutlier clade into subclades of Saudi Arabia, Ethiopia and Western /
Northern Africa (Nigeria and Morocco) viruses. The phylogenetic tree of the spike gene
resembles the region-dependent diversity of MERS-CoV as observed in these camels, in which
viruses from Africa and the Middle East were grouped into two separate clades (17).
However, two sequences from Morocco (CAC2505 and CAC2753) were distinct from other
sequences and fell into clade C2 of BCoVs, sharing a common ancestor with a cluster of
BCoVs from France.


The phylogeny of the N gene of DcCoV-HKU23 was more diverse, with virus sequences
being distributed in BCoV clade C3, which included 4 Nigerian sequences (NV1010, NV1092,
NV1097 and NV1385), one Moroccan sequence (CAC2586) and two sequences from
Ethiopia (CAC1320, CAC1452). These sequences clustered together monophyletically and were
related to the human enteric coronavirus strain 4408. The other 3 sequences, Ethiopia
CAC1019 and Morocco (CAC2753, CAC2505), were grouped within BCoV clade C2 together with
the strains from Saudi Arabia.


Combining the clade classification of these three gene regions, there were 3 circulating
genotypes of DcCoV-HKU23, viz. C2/C2/C2, C2/Coutlier/C2 and C2/Coutlier/C3 (Table 5),
suggesting that multiple genetic recombination events occurred in the past. This contrasts
with BCoV, which does not appear to exhibit such genetic instability (31). However, there is
a lack of BCoV sequence data from Africa. The recombinant genotypes C2/Coutlier/C2 and
C2/Coutlier/C3 were observed in DcCoV-HKU23 across the African region without a distinct
geographic pattern. Genotype C2/Coutlier/C2 was observed in one sample from Saudi Arabia
and one from Ethiopia, while C2/Coutlier/C3 was observed in 4 samples from Nigeria, one
from Ethiopia and one from Morocco. The BCoV genotype C2/C2/C2 was observed in two
Moroccan strains (CAC2753, CAC2505), possibly suggesting a direct spill-over of BCoV
genotype C2 to the camel population.


Distinct genetic region upstream of NS5a among DcCoV-HKU23. 


The genomic organization of African DcCoV-HKU23 is almost identical to that of Saudi Arabian
DcCoV-HKU23 and BCoV, except for a 400 nt region between the S gene and NS5a that was
found to be highly divergent among DcCoV-HKU23 and other clade A B-CoVs (Figure 4). In
BCoV, this region contains 2 ORFs (4a and 4b) that encode non-structural proteins of
4.9 kDa and 4.8 kDa, respectively. The absence of ORF4a and 4b in HCoV-OC43 suggested
they are not essential for viral replication (32). Pairwise comparison of this region among
all DcCoV-HKU23 with BCoV-DB2, RbCoV-HKU14 and HCoV-OC43 revealed nonsense
mutations in both ORF4a and 4b in DcCoV-HKU23 from Saudi Arabia and Africa, resulting
in premature stop codons and a truncated protein. A 200 nt deletion was observed in NV1385
after the premature stop codon of ORF4a. Similar, though not identical, patterns of deletion
were also found in RbCoV-HKU14 and HCoV-OC43, resulting in more truncated protein
sequences. Although these protein sequences vary, the nucleotide sequences of
DcCoV-HKU23/362F, DcCoV-HKU23/CAC1019, RbCoV-HKU14 and HCoV-OC43 in fact share a
high pairwise similarity with the BCoV sequences that contain both full-length ORF4a and 4b.


Interestingly, DcCoV/NV1097 and DcCoV/CAC2586 contain a distinct ORF4a in this region,
not seen in the other DcCoVs, that encodes an 8.9 kDa non-structural protein. A BLAST search
mapped this protein to an 8.6 kDa non-structural protein encoded
by a rodent CoV, RtMm-CoV-1/IM2014 (Accession No.: KY370052.1), with about 60%
amino acid similarity (Figure 4b). The BCoV ORF4a and 4b have previously been suggested
to be counterparts of the 11 kDa non-structural protein in mouse hepatitis virus (MHV) (33).
The discovery of this rodent-like ORF4a encoded in DcCoV-HKU23 illustrated a possible
homologous recombination with rodent coronaviruses that highlighted its distinct
evolutionary history as compared to BCoV.


Recombination analysis with other clade A B-CoVs.


To study the possibility of cross-species recombination, Bootscan analysis of full genomes of
DcCoV-HKU23 in Africa with other clade A B-CoVs was performed using Simplot, version
3.5.1. A multiple sequence alignment of the 6 African strains of DcCoV-HKU23 with
RbCoV-HKU14, PHEV, BCoV-DB2, EqCoV-NC99, RodentCoV-IM2014, DcCoV-HKU23 from
Saudi Arabia and HCoV-OC43 was made. Using the Nigeria strain DcCoV-HKU23 as the
query, recombination signals were observed with BCoV-DB2 at the NS2a gene (position
21901-22600), with RbCoV-HKU14 at the hemagglutinin esterase (HE) gene (position
22601-23600) and with RodentCoV-IM2014 at the region of the ORF4a, 4b and NS5a genes
(position 27901-28800) (Figure 5a). Phylogenetic analysis of the BCoV signal region showed
that BCoV-DB2 clustered with the group of DcCoV-HKU23 from Africa (Figure 5b). The
signal at the NS2a gene extended the region showing the mixing of BCoV with DcCoV-HKU23,
in addition to the RdRp and N genes. The tree of the RbCoV signal region showed that
RbCoV-HKU14 changed its phylogenetic position and linked to the cluster of
DcCoV-HKU23 from Nigeria and Ethiopia. This clustering suggests a recombination event
between a common ancestor of DcCoV-HKU23 sequences from Nigeria or Ethiopia and
RbCoV-HKU14 or an RbCoV-like virus. The RodentCoV signal at the region of ORF4a and 4b
supported the homologous recombination of West African DcCoV-HKU23 with a RodentCoV
suggested by the genetic organization analysis. A tree plotted from position 27901-28800
showed that DcCoV-HKU23 were split into two separate evolutionary pathways (illustrated as
BCoV-like and RodentCoV-like in Figure 4a), in which West African DcCoV-HKU23 clustered
as an outgroup with RodentCoV-IM2014, while East African DcCoV-HKU23 clustered with
BCoV.


Discussion 


Our data provide an enhanced understanding of the diversity and circulation of an endemic
DcCoV-HKU23 in African dromedaries in East (Ethiopia), West (Nigeria) and North
(Morocco) Africa in comparison with viruses in Saudi Arabia, and of the evolutionary
relationships between HKU23 and BCoV, an important pathogen of cattle (34). In this study,
HKU23 viral RNA was detected in 2.2% of dromedary nasal swabs tested in Nigeria, 0.5% in
Morocco and 1.4% in Ethiopia. The rates were higher than in a
previous study with year-round sampling of camel nasal swabs in Saudi Arabia, where
a virus detection rate of 0.2% was reported (18). We detected DcCoV-HKU23 in most
sampling months from October to April, which was the only period of the year we
investigated, suggesting there was no clear seasonality of virus activity. While co-circulation
of three CoVs, MERS-CoV, DcCoV-229E and DcCoV-HKU23, was reported in Saudi
Arabian camels, similar virus circulation was also observed in Nigeria, with a positive rate of
2.2% for MERS-CoV (21) and 1.0% for DcCoV-229E (data not published). The serological
prevalence of DcCoV-HKU23 antibodies in camels was 92% in Nigeria, 91% in Ethiopia
and 79% in Morocco, suggesting a widespread circulation of this or an
antigenically related virus over a broad geographical area across Africa, in a manner
comparable with MERS-CoV (20, 35). Younger camels had lower seropositive rates than
adults in Morocco, but no age-related difference was observed in Nigeria and Ethiopia.
Differences in DcCoV-HKU23 seroprevalence may be attributed to variations in
husbandry practices, co-habitant animal hosts or climatic factors between these countries.
Similar seroprevalences were observed in camels in abattoirs and farms, suggesting virus
circulation has already been established at the farm or herd level and did not solely reflect
amplification in the camel marketing chain. There is extensive cross-neutralization between
DcCoV-HKU23 and BCoV, with a trend to higher titers to DcCoV-HKU23. The high amino
acid sequence identity of the spike protein (92% - 97%) of DcCoV-HKU23 and BCoV very
likely contributed to the cross-neutralization between the two viruses.


The full genome sequences of DcCoV-HKU23 across Africa and Saudi Arabia allowed us
to study the genetic diversity of this virus in camel populations. The distance plot using the
ORF1ab gene gives a modest estimate of the diversity within DcCoV-HKU23 sequences due to
random mutations, with a lesser contribution from recombination. The observed diversity of
DcCoV-HKU23 was comparable to BCoV, suggesting both viruses have been established in their
hosts for a similar period of time. It is of interest that DcCoV-HKU23 has a much
higher diversity than MERS-CoV. At present there are three CoVs (Camel CoV
229E, DcCoV-HKU23 and MERS-CoV) co-circulating in dromedary camels. The narrower
genetic diversity of MERS-CoV possibly indicates a more recent introduction into camels or
a purifying selection event in its more recent evolutionary history.


Phylogenetic analysis of the full genomes of DcCoV-HKU23 in Africa with other published
BCoV sequences available at GenBank identified incongruent topologies in the phylogenetic
trees of RdRp, S and N, indicating events of recombination. With reference to the BCoV
genotyping method described previously, DcCoV-HKU23 obtained in this study were
classified into 3 genotypes: C2/C2/C2, C2/Coutlier/C2 and C2/Coutlier/C3. The Coutlier
clade has no BCoV sequences and is a uniquely DcCoV-HKU23 clade. Two DcCoV-HKU23 from
Morocco (CAC2505, CAC2753) showed a genotype C2/C2/C2, which is a non-recombinant
variant of BCoV clade C2, suggesting a BCoV evolutionary origin and the possibility of a
BCoV spill-over into the Moroccan camel population. This is the first time that a
non-recombinant BCoV variant has been detected in camels. Previously, BCoV has been detected
in a wide range of ungulate hosts, including bovine, waterbuck, sambar deer, white-tail deer,
alpaca, giraffe, sable antelope, buffalo and yak (36-39). The expansion of BCoV to camel
hosts further illustrates its ability to cross species and infect other similar ungulate hosts.
Recently, surveillance of BCoV has been expanded into regions of East Asia, including
China (40), Korea (41) and Vietnam (42), and into the Caribbean in Cuba (43). The spike
genes of BCoV from these regions were all phylogenetically clustered into C1. Figure 6
summarizes the currently known geographic distribution of different genotypes of
BCoV/HKU23-like viruses in camels, bovines and other species.
The phylogeny of the S gene of DcCoV exhibited a region-dependent diversity that is also
noted in BCoV and MERS-CoV. The C1 and C2 genotypes of BCoV correspond to the
American-Asian and European clusters respectively. In this study, the two Moroccan strains
of the C2 genotype clustered with BCoVs detected in France, suggesting a divergence from the
common ancestor of C2 genotype BCoVs. Other DcCoVs identified in this study as well as
those previously identified in Saudi Arabia are recombinants, because their S gene showed an
outgroup topology to BCoVs and HCoV-OC43, indicating DcCoV-HKU23 acquired its S
gene through recombination with an ancestor yet to be identified. However, we cannot infer
from the present data whether the recombination occurred before or after the introduction of
the virus to camels. The phylogeny of the N gene of DcCoV-HKU23 also varied among the 8
recombinant sequences with the Coutlier genotype in the S gene. Seven DcCoV-HKU23 sequences
(NV1010, NV1092, NV1097, NV1385, CAC1320, CAC1452 and CAC2586) were grouped
into C3 in the N gene tree, indicating a close relatedness to prototype BCoV in the N gene.
DcCoV-HKU23/CAC1019 from Ethiopia and DcCoV-HKU23 from Saudi Arabia were
grouped into the C2 European clade. These C2 genotype sequences suggest a possible
recombination event between the N gene of a non-recombinant BCoV strain and a
recombinant DcCoV-HKU23 strain as the backbone. Overall, DcCoV-HKU23 exhibited a
broader diversity that contrasted with the genetic stability observed in BCoV. However, a
limitation of the analysis is the lack of BCoV sequence data from Africa. With more BCoV
genetic datasets from Africa, the recombination events of DcCoV-HKU23 and BCoV may be
more clearly resolved.


Cross-species recombination of DcCoV-HKU23 was also observed with other clade A B-CoV
species and involved a rabbit-CoV HKU14-like virus. Rabbit-CoV HKU14 is a virus initially
discovered through surveillance in China (44), but similar viruses may be present in a much
wider geographic region. Bootscan analysis of DcCoV-HKU23 from Nigeria and
Ethiopia showed a recombination signal with an RbCoV-HKU14-like virus at position
22601-23600, encoding the NS2a and HE genes. The HE gene in clade A B-CoVs has been
suggested to have been acquired by heterologous recombination with influenza C virus. The
recombination here was located in a similar region, suggesting a possible recombination
hotspot. The phylogeny of the signal region showed the RbCoV sequence was linked to the
DcCoV-HKU23 sequences from Ethiopia and Nigeria at a basal position, suggesting the
recombination event may have occurred with the ancestral sequence from these two regions.
The HE protein encoded in clade A B-CoVs plays a role in receptor binding through its
receptor-binding or receptor-destroying activity towards glycan components (45). An example
of the functional property of HE has been demonstrated in the adaptation of HCoV-OC43 and
HKU1 to infect humans, when the HE lectin domain was progressively lost through
accumulated mutations (46). The diversity of the HE gene between parental and recombinant
DcCoVs deserves further research to characterize the glycoconjugates targeted by the
receptor-destroying activity and to study how such virion-glycan interactions may contribute
to its host tropism.


We further identified multiple mutations in the downstream ORF4a and 4b encoded between
the S and NS5a genes that provided insight into the evolutionary origin of clade A B-CoVs.
The deletion patterns of ORF4a and 4b observed in DcCoV-HKU23, RbCoV-HKU14 and
HCoV-OC43 revealed a stepwise deletion among these sequences. While these patterns may
suggest a BCoV origin of these sequences, it is also possible that an ancestral virus infected
multiple hosts and bovines preferentially retained those ORFs. Nonsense mutations and
deletions of these ORFs in DcCoV-HKU23, RbCoV-HKU14 and HCoV-OC43 supported the
contention that this region may not contain essential genetic sequences and that the loss of
such genetic information will not impair virus fitness in dromedaries. In fact, ORF4a and 4b
in BCoV have previously been suggested to be vestiges of an 11-kDa protein encoded by mouse
hepatitis virus (MHV), resulting from a nonsense mutation in the middle of ORF4 (33). The
region between the S and E genes may thus be of mouse or murine CoV origin. As
additional evidence, we also observed another 8.9 kDa protein encoded by the ORF4a in
DcCoV-HKU23 identified in this study, which mapped to a similar protein in rodent CoV
with 60% amino acid similarity. Bootscan analysis showed DcCoV from Morocco (CAC2586)
and Nigeria (NV1010, 1092, 1097 & 1385) were phylogenetically outgrouped with rodent
CoV at position 27901-28800, suggesting a possible homologous recombination.
Taken together, these sequences illustrated multiple origins of clade A B-CoVs from
rodent-like species. Recent surveillance of rodent species has identified many more novel CoV
species, including ChRCoV-HKU24, LAMV, LRLV and rodent CoV (47-49). These sequences are
phylogenetically positioned at the deep branch rooting members of clade A B-CoVs. The
discovery of sequence remnants of rodent CoV in DcCoV-HKU23 further supports the
involvement of rodent-like species in the evolutionary history of clade A B-CoVs (47).
The recombination events identified in this study require co-infection of the same host,
and the same cell, by two or more different CoVs, which in turn requires the parental
viruses to co-circulate in the same geographic region. The wide host range and
geographic range of BCoV-like viruses could potentially facilitate recombination with other
coronaviruses. Recombination events that led to the emergence of SARS-like CoV likely
occurred in its natural reservoirs in bats, and the virus then spilled over to intermediate hosts
such as civets and raccoon dogs, and to humans. Since rodents harbor the highest diversity of clade
A B-CoVs (49), recombination events may occur in rodents, with the recombinant virus
subsequently spilling over to other mammalian hosts if it has a competitive advantage over
pre-existing strains. Thus, one may speculate that the
recombination of DcCoV-HKU23 with RabbitCoV-HKU14 and RodentCoV-IM2014 could
have occurred in a rodent species.


Limitations of this study include the lack of rectal swabs with which to evaluate the tissue
tropism associated with DcCoV-HKU23 infection. Although the virus does not seem to cause
significant disease, as it was detected in apparently healthy camels in abattoirs, it is unclear
whether infection is limited to the upper respiratory mucosa, as with MERS-CoV, or
whether it spreads more systemically. Specimens from sites of the body other than the upper
respiratory tract may provide information on the tropism of the virus. The lack of year-round
sampling precludes conclusions on the seasonality of virus activity. On the other hand, a
strength of the study is its sampling across East, West and North Africa, which allows an
understanding of virus diversity across a large geographic region. The lack of BCoV
sequences from Africa precludes a more definitive analysis of the origins of the different
genotypes of HKU23.


In conclusion, the study showed a mosaic structure of DcCoV-HKU23 that is likely to have
arisen from several recombination events among clade A B-CoVs. Among the three
identified DcCoVs that circulate in dromedary camels, MERS-CoV has so far demonstrated
intraspecies recombination, while DcCoV-HKU23 has additionally demonstrated interspecies
recombination. Our study highlights the importance of studying recombination of
CoVs to understand their evolutionary history and cross-species transmission
in dromedaries.

Table 1. Screening of DcCoV-HKU23 in dromedary camels from Nigeria, Ethiopia and Morocco.

Country   Sampling year  Sample positive / tested*      Ct range     Field collections
                         (% positive)                                reported in Ref.
Nigeria   2015, 2016     Total: 55/2529 (2.2%)          18.8-36.9    So et al, 2018
                         Young camels: 8/194 (4.1%)
                         Adult camels: 47/2335 (2.0%)
Morocco   2015, 2016     Total: 7/1569* (0.45%)         26.5-33.4    Miguel et al, 2017
                         Young camels: 6/584 (1.0%)
                         Adult camels: 1/577 (0.17%)
Ethiopia  2015           Total: 9/621* (1.4%)           23.2-30.2    Miguel et al, 2017
                         Young camels: 1/136 (0.74%)
                         Adult camels: 5/314 (1.6%)

*Note: Age information was not available for all sampled camels.

Table 2. The monthly RNA rate of DcCoV-HKU23 in camels from Africa and Saudi Arabia.

                 RNA rate (%)
                 2014-2015                         2015-2016
Month            Morocco          Ethiopia         Nigeria          Morocco
May - September  ND               ND               ND               ND
October          ND               ND               1/526 (0.2%)     ND
November         ND               ND               15/739 (2.0%)    ND
December         ND               ND               1/35 (2.9%)      ND
January          ND               2/120 (1.7%)     12/531 (2.3%)    0/349 (0%)
February         0/195 (0%)       7/501 (1.4%)     26/698 (3.7%)    ND
March            0/186 (0%)       ND               ND               5/385 (1.3%)
April            ND               ND               ND               2/453 (0.4%)

ND, no data.

Table 3. The seropositive rate of DcCoV-HKU23 in camel sera from Nigeria, Ethiopia and Morocco by micro-neutralization assay.

Camel age group    Seropositive rate

Table 4. Estimates of evolutionary divergence between the complete genome sequences of DcCoV-HKU23 identified in Africa (Nigeria, Ethiopia and
Morocco) and Saudi Arabia, Bovine CoV and Human CoV-OC43. Analyses were conducted using the Tamura-Nei model in MEGA7.

                                                                                 Pairwise evolutionary divergence in the number of base substitutions per site
Host    Country       Strain               Accession   Genome size (bp)  GC      NV1010  NV1092  NV1097  NV1385  CAC1019 CAC2586 368F    362F    265F    Ry123   DB2     OC43
Camel   Nigeria       DcCoV-HKU23-NV1010               30780             36.90%  -
                      DcCoV-HKU23-NV1092               30799             36.90%  0.0001  -
                      DcCoV-HKU23-NV1097               31075             37.00%  0.0167  0.0168  -
                      DcCoV-HKU23-NV1385               30798             36.90%  0.0004  0.0004  0.0168  -
        Ethiopia      DcCoV-HKU23-CAC1019              31021             36.90%  0.0206  0.0207  0.0246  0.0206  -
        Morocco       DcCoV-HKU23-CAC2586              31062             37.00%  0.0191  0.0191  0.0129  0.0191  0.0270  -
        Saudi Arabia  DcCoV-HKU23-368F     KF906251.1  31052             37.00%  0.0263  0.0265  0.0231  0.0263  0.0223  0.0225  -
                      DcCoV-HKU23-362F     KF906250.1  31052             37.00%  0.0263  0.0265  0.0231  0.0263  0.0223  0.0225  0.0000  -
                      DcCoV-HKU23-265F     KF906249.1  31052             37.00%  0.0264  0.0266  0.0235  0.0264  0.0227  0.0229  0.0017  0.0017  -
                      DcCoV-HKU23-Ry123    KT368891.1  31041             37.00%  0.0269  0.0270  0.0237  0.0268  0.0229  0.0231  0.0017  0.0017  0.0026  -
Bovine                BCoV-DB2             DQ811784.2  31007             37.10%  0.0299  0.0300  0.0288  0.0299  0.0249  0.0279  0.0197  0.0197  0.0200  0.0204  -
Human                 HCoV-OC43            AY391777.1  30738             36.80%  0.0466  0.0467  0.0468  0.0468  0.0445  0.0463  0.0420  0.0420  0.0421  0.0425  0.0333  -
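
The Table 4 caption names the Tamura-Nei model as run in MEGA7. As a rough illustration of what that distance measures, the Python sketch below computes a Tamura-Nei (1993) distance for a single pair of aligned sequences. The function name, the pooling of base frequencies over the two sequences being compared, and the skipping of gapped or ambiguous sites are assumptions made here for illustration; MEGA7 may handle these details differently, so the values need not reproduce Table 4.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def tamura_nei_distance(seq1, seq2):
    """Tamura-Nei (1993) distance between two aligned nucleotide sequences.

    Hypothetical helper for illustration only; not the MEGA7 implementation.
    """
    # Keep only columns where both sequences carry an unambiguous base.
    pairs = [(x, y) for x, y in zip(seq1.upper(), seq2.upper())
             if x in "ACGT" and y in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")

    # Base frequencies pooled over both sequences (an assumption of this sketch).
    freq = {base: 0.0 for base in "ACGT"}
    for x, y in pairs:
        freq[x] += 1
        freq[y] += 1
    for base in freq:
        freq[base] /= 2 * n
    if min(freq.values()) == 0:
        raise ValueError("all four bases must be present")

    g_r = freq["A"] + freq["G"]  # purine frequency
    g_y = freq["C"] + freq["T"]  # pyrimidine frequency

    # Proportions of purine transitions, pyrimidine transitions, transversions.
    p1 = sum(x != y and x in PURINES and y in PURINES for x, y in pairs) / n
    p2 = sum(x != y and x in PYRIMIDINES and y in PYRIMIDINES for x, y in pairs) / n
    q = sum((x in PURINES) != (y in PURINES) for x, y in pairs) / n

    k1 = 2 * freq["A"] * freq["G"] / g_r
    k2 = 2 * freq["T"] * freq["C"] / g_y
    k3 = 2 * (g_r * g_y
              - freq["A"] * freq["G"] * g_y / g_r
              - freq["T"] * freq["C"] * g_r / g_y)

    w1 = 1 - p1 / k1 - q / (2 * g_r)
    w2 = 1 - p2 / k2 - q / (2 * g_y)
    w3 = 1 - q / (2 * g_r * g_y)
    return -k1 * math.log(w1) - k2 * math.log(w2) - k3 * math.log(w3)
```

Because the model corrects for multiple substitutions at the same site, the distance is slightly larger than the raw proportion of differing sites; for closely related genomes such as those in Table 4 the correction is small.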

Table 5. Summary of the genotypes of DcCoV-HKU23 identified in this study.

Virus        Region        Strain    RdRp  S          N   Genotype
DcCoV-HKU23  Nigeria       NV1010    C2    C outlier  C3  C2/C outlier/C3
                           NV1092    C2    C outlier  C3  C2/C outlier/C3
                           NV1097    C2    C outlier  C3  C2/C outlier/C3
                           NV1385    C2    C outlier  C3  C2/C outlier/C3
             Ethiopia      CAC1019   C2    C outlier  C2  C2/C outlier/C2
                           CAC1320   C2    C outlier  C3  C2/C outlier/C3
                           CAC1452   C2    C outlier  C3  C2/C outlier/C3
             Morocco       CAC2505   C2    C2         C2  C2/C2/C2
                           CAC2586   C2    C outlier  C3  C2/C outlier/C3
                           CAC2753   C2    C2         C2  C2/C2/C2
             Saudi Arabia  362F      C2    C outlier  C2  C2/C outlier/C2
BCoV         Europe        BCoV/FRA  C2    C2         C2  C2/C2/C2
             Americas      BCoV ENT  C1    C1         C1  C1/C1/C1

Figure 1. Scatter plot of camel sera (n=13) with different neutralizing titres against DcCoV-HKU23
that were tested for cross-neutralization against BCoV.

Figure 2. Comparison of the genetic diversity of the ORF1ab gene of DcCoV-HKU23, BCoV and MERS-CoV.
Sequence distance plots were generated using SSE version 1.3 with a sliding window of 250 and a step
size of 25 nucleotides. Sequences were obtained from the GenBank database, and closely related sequences
with a pairwise distance < 0.001 were excluded from the analysis. A total of 10 DcCoV-HKU23 sequences,
28 BCoV sequences and 88 MERS-CoV sequences were included in the analysis.
Axes: pairwise distance (y) against ORF1ab alignment position (x; 22650 bp).
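
The sliding-window scan described in the Figure 2 caption (window of 250 nucleotides, step of 25) can be illustrated with a minimal Python sketch. It is not the SSE implementation: the function names are hypothetical, the uncorrected p-distance is used as a stand-in because the caption does not state which distance measure SSE applied, and the input is assumed to be a list of equal-length aligned sequences.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    return diffs / compared if compared else float("nan")


def sliding_mean_distance(alignment, window=250, step=25):
    """Return (window midpoint, mean pairwise p-distance) for each window."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")
    points = []
    for start in range(0, length - window + 1, step):
        end = start + window
        dists = [p_distance(a[start:end], b[start:end])
                 for a, b in combinations(alignment, 2)]
        # Ignore pairs with no comparable sites in this window.
        dists = [d for d in dists if not math.isnan(d)]
        mean = sum(dists) / len(dists) if dists else float("nan")
        points.append((start + window // 2, mean))
    return points
```

Running `sliding_mean_distance` separately on the DcCoV-HKU23, BCoV and MERS-CoV alignments and plotting the three curves against alignment position would give a plot of the same general form as Figure 2.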

Figure 3. Phylogenetic analysis of the a) RdRp, b) spike and c) nucleocapsid genes of DcCoV-HKU23 from
Nigeria (colored blue), Morocco (colored red) and Ethiopia (colored green). Reference DcCoV-HKU23
sequences from Saudi Arabia are colored brown. The alignment of each gene was manually trimmed to
obtain an alignment of 2769 nt for RdRp, 4137 nt for spike and 1347 nt for nucleocapsid, respectively.
Trees were constructed by the maximum likelihood method using IQ-Tree with the best-fit model
automatically selected by ModelFinder. Node labels indicate bootstrap values calculated using ultrafast
bootstrap with 1000 replicates. Trees were mid-point rooted.

Figure 4. A) Genomic organization of the region between the spike gene and NS5a in DcCoV-HKU23 and other clade A B-CoVs. Distinct ORF patterns were
found in DcCoV-HKU23 in this region. Stop codons in the ORFs are labelled by black triangles. Horizontal dotted lines indicate regions of deletion. B) The
amino acid sequence alignment of the 8.9 kDa ORF4a of rodent coronaviruses (RtAs-CoV/IM2014, accession no. KY370044; RtMm-CoV-1/IM2014, accession
no. KY370052; RtMruf-CoV-2/JL2014, accession no. KY370046) and DcCoV-HKU23.

Figure 5. A) Recombination analysis of the genomes of DcCoV-HKU23, BCoV-DB2, PHEV, HCoV-OC43, RbCoV-HKU14, EquineCoV-NC99 and
RodentCoV-RtMm-CoV-1/IM2014. Bootscan analysis was performed with Simplot, version 3.5.1, using a 50% consensus sequence of the DcCoV-HKU23 in Nigeria with the
genotype C2/C outlier/C3 as the query. B) Phylogenetic trees from representative regions were constructed by the maximum likelihood method using IQ-Tree,
version 1.6.8. Trees were midpoint rooted. Accession numbers of the CoVs used in this analysis: DcCoV-HKU23/362F (KF906250.1), BCoV-DB2 (DQ811784.2),
PHEV (KY994645), HCoV-OC43 (AY391777.1), RbCoV-HKU14 (JN874559), EquineCoV-NC99 (EF446615) and RodentCoV-IM2014 (KY370052).

Figure 6. The geographic distribution of different genotypes of BCoV/HKU23-like viruses in camels,
bovines and other species. The map was drawn using R software.